NSF PAR Search | NSF Public Access Repository

Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level

Hassani, Ali; Hwu, Wen-mei; Shi, Humphrey (December 2024, NeurIPS 2024)

Neighborhood attention reduces the cost of self attention by restricting each token’s attention span to its nearest neighbors. This restriction, parameterized by a window size and dilation factor, draws a spectrum of possible attention patterns between linear projection and self attention. Neighborhood attention, and more generally sliding window attention patterns, have long been bounded by infrastructure, particularly in higher-rank spaces (2-D and 3-D), calling for the development of custom kernels, which have been limited in functionality, performance, or both. In this work, we aim to massively improve upon existing infrastructure by providing two new methods for implementing neighborhood attention. We first show that neighborhood attention can be represented as a batched GEMM problem, similar to standard attention, and implement it for 1-D and 2-D neighborhood attention. These kernels on average provide 895% and 272% improvement in full precision runtime compared to existing naive CUDA kernels for 1-D and 2-D neighborhood attention respectively. We find that aside from being heavily bound by memory bandwidth, certain inherent inefficiencies exist in all unfused implementations of neighborhood attention, which in most cases undo their theoretical efficiency gain. Motivated by the progress made in fused dot-product attention kernels, we developed fused neighborhood attention, an adaptation of fused dot-product attention kernels that allows fine-grained control over attention across different spatial axes. Known for reducing the quadratic time complexity of self attention to a linear complexity, neighborhood attention can now enjoy a reduced and constant memory footprint, and record-breaking half precision runtime. We observe that our fused implementation successfully circumvents some of the unavoidable inefficiencies in unfused implementations. While our unfused GEMM-based kernels only improve half precision performance compared to naive kernels by an average of 548% and 193% in 1-D and 2-D problems respectively, our fused kernels improve naive kernels by an average of 1759% and 958% in 1-D and 2-D problems respectively. These improvements translate into up to 104% improvement in inference and 39% improvement in training existing models based on neighborhood attention, and additionally extend its applicability to image and video perception, as well as other modalities.
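To illustrate the attention pattern the abstract describes, here is a minimal 1-D sketch in plain Python. It is not the authors' GEMM-based or fused CUDA kernel. The function names, the choice to clamp the window at sequence borders so that every token sees exactly `window` keys, and the omission of the dilation factor are all assumptions made for brevity.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def neighborhood_attention_1d(q, k, v, window):
    """Hypothetical reference: q, k, v are lists of n vectors; window is odd.

    Token i attends only to `window` consecutive tokens around it.
    Assumption: near the borders the window is shifted (clamped) rather
    than truncated, so the neighborhood size stays constant.
    Dilation is intentionally left out of this sketch.
    """
    n, d = len(q), len(q[0])
    if window % 2 != 1 or not 1 <= window <= n:
        raise ValueError("window must be odd and between 1 and n")
    half = window // 2
    out = []
    for i in range(n):
        start = min(max(i - half, 0), n - window)
        idx = range(start, start + window)
        # Scaled dot-product scores against the neighborhood only.
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d) for j in idx
        ]
        weights = softmax(scores)
        # Weighted sum of the neighborhood's value vectors.
        out.append(
            [sum(w * v[j][c] for w, j in zip(weights, idx)) for c in range(len(v[0]))]
        )
    return out
```

With `window=1` each token attends only to itself, so the output equals `v`; with `window` equal to an odd sequence length every token attends to all tokens, which is ordinary self attention. The per-token cost grows with `window` rather than with the sequence length, which is the source of the linear complexity mentioned above.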
Full Text Available
Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level

https://doi.org/10.52202/079017-2065

Hassani, Ali; Hwu, Wen-mei; Shi, Humphrey (January 2024, Neural Information Processing Systems Foundation, Inc. (NeurIPS))

Full Text Available
Monocular Facial Presentation–Attack–Detection: Classifying Near-Infrared Reflectance Patterns

https://doi.org/10.3390/app13031987

Hassani, Ali; Diedrich, Jon; Malik, Hafiz (February 2023, Applied Sciences)

This paper presents a novel material spectroscopy approach to facial presentation–attack–detection (PAD). Best-in-class PAD methods typically detect artifacts in the 3D space. This paper proposes that similar features can be achieved in a monocular, single-frame approach by using controlled light. A mathematical model is produced to show how live faces and their spoof counterparts have unique reflectance patterns due to geometry and albedo. A rigorous dataset is collected to evaluate this proposal: 30 diverse adults and their spoofs (paper-mask, display-replay, spandex-mask and COVID mask) under varied pose, position, and lighting for 80,000 unique frames. A panel of 13 texture classifiers is then benchmarked to verify the hypothesis. The experimental results are excellent. The material spectroscopy process enables a conventional MobileNetV3 network to achieve 0.8% average-classification-error rate, outperforming the selected state-of-the-art algorithms. This demonstrates that the proposed imaging methodology generates extremely robust features.
Full Text Available
CSTR: A Compact Spatio-Temporal Representation for Event-Based Vision

https://doi.org/10.1109/ACCESS.2023.3316143

El_Shair, Zaid A; Hassani, Ali; Rawashdeh, Samir A (January 2023, IEEE Access)

Full Text Available
Efficiently Mitigating Face-Swap-Attacks: Compressed-PRNU Verification with Sub-Zones

https://doi.org/10.3390/technologies10020046

Hassani, Ali; Malik, Hafiz; Diedrich, Jon (April 2022, Technologies)

Face-swap-attacks (FSAs) are a new threat to face recognition systems. FSAs are essentially imperceptible replay-attacks using an injection device and generative networks. By placing the device between the camera and the computer, attackers can present any face as desired. This is particularly potent because it maintains liveness features, because it is a sophisticated alteration of a real person, and because it can go undetected by traditional anti-spoofing methods. To address FSAs, this research proposes a noise-verification framework. Even the best generative networks today leave alteration traces in the photo-response noise profile; these are detected by comparing challenge images against the camera enrollment. This research also introduces compression and sub-zone analysis for efficiency. Benchmarking with open-source tampering-detection algorithms shows the proposed compressed-PRNU verification robustly verifies facial-image authenticity while being significantly faster. This demonstrates a novel efficiency for mitigating face-swap-attacks, including denial-of-service attacks.
Full Text Available
FarSight: A Physics-Driven Whole-Body Biometric System at Large Distance and Altitude

https://doi.org/10.1109/WACV57701.2024.00611

Liu, Feng; Ashbaugh, Ryan; Chimitt, Nicholas; Hassan, Najmul; Hassani, Ali; Jaiswal, Ajay; Kim, Minchul; Mao, Zhiyuan; Perry, Christopher; Ren, Zhiyuan; et al (January 2024, IEEE)

Full Text Available
A New Data Association Method Using Kalman Filter Innovation Vector Projections

https://doi.org/10.1109/PLANS46316.2020.9110229

Joerger, Mathieu; Hassani, Ali (April 2020, 2020 IEEE/ION Position, Location and Navigation Symposium (PLANS))
This paper describes the derivation, analysis and implementation of a new data association method that provides a tight bound on the risk of incorrect association for LiDAR feature-based localization. Data association (DA) is the process of assigning currently-sensed features with ones that were previously observed. Most DA methods use a nearest-neighbor criterion based on the normalized innovation squared (NIS). They require complex algorithms to evaluate the risk of incorrect association because sensor state prediction, prior observations, and current measurements are uncertain. In contrast, in this work, we derive a new DA criterion using projections of the extended Kalman filter's innovation vector. The paper shows that innovation projections (IP) are signed quantities that not only capture the impact of an incorrect association in terms of its magnitude, but also of its direction. The IP-based DA criterion also leverages the fact that incorrect associations are known and well-defined fault modes. Thus, as compared to NIS, IPs provide a much tighter bound on the predicted risk of incorrect association. We analyze and evaluate the new IP method using simulated and experimental data for autonomous inertial-aided LiDAR localization in a structured lab environment.
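For context, the baseline that the abstract contrasts against, nearest-neighbor association by normalized innovation squared (NIS), can be sketched as follows. This is not the paper's innovation-projection (IP) criterion. The 2-D setup, the single shared innovation covariance `S`, and the function names are assumptions chosen to keep the example short.

```python
def nis(innovation, S):
    """Normalized innovation squared: gamma^T S^{-1} gamma for a 2-D innovation.

    innovation: (x, y) difference between measured and predicted feature.
    S: 2x2 innovation covariance as ((a, b), (c, d)).
    """
    (a, b), (c, d) = S
    det = a * d - b * c
    if det == 0:
        raise ValueError("innovation covariance must be invertible")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    x, y = innovation
    return x * (inv[0][0] * x + inv[0][1] * y) + y * (inv[1][0] * x + inv[1][1] * y)


def nearest_neighbor_association(measurement, predicted_features, S):
    """Return (index, NIS) of the predicted feature with the smallest NIS.

    Assumption: every candidate shares the same covariance S; in a real
    filter S generally differs per candidate.
    """
    scores = [
        nis((measurement[0] - p[0], measurement[1] - p[1]), S)
        for p in predicted_features
    ]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

Because NIS is a squared quantity, it discards the sign of the innovation; the abstract's point is that signed projections retain direction as well as magnitude.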
Full Text Available
Experimental Integrity Evaluation of Tightly-Integrated IMU/LiDAR Including Return-Light Intensity Data

https://doi.org/10.33012/2019.17095

Hassani, Ali; Morris, Nicholas; Spenko, Matthew; Joerger, Mathieu (October 2019, ION GNSS+, The International Technical Meeting of the Satellite Division of The Institute of Navigation)
Full Text Available
LiDAR Data Association Risk Reduction, Using Tight Integration with INS

https://doi.org/10.33012/2018.15976

Hassani, Ali; Joerger, Mathieu; Arana, Guillermo Dueñas; Spenko, Matthew (October 2018, ION GNSS+, The International Technical Meeting of the Satellite Division of The Institute of Navigation)

Full Text Available
